Spatial data is any type of data that directly or indirectly references a specific geographical area or location. These can be, for example, geographic features on the landscape or environmental properties of an area such as temperature or air quality.
Spatial data can be continuous or discrete just like regular data, and in both cases it can be represented as a vector or a raster. The main difference being vector data uses points and lines to represent spatial data, while raster data represents data uses pixelled or gridded, where each pixel/cell represents a specific geographic location and the information therein. Raster data will be heavily influenced by the size of the pizels/cells, i.e. resolution.
Both vector and raster data are planar representations of the world, a 3-dimensional sphere, and as such are not perfect copies. Depending on how the planar representation is created it will distort more or less certain areas of the world, therefore many representations exist. These are called projections, as the representations project the 3 dimensional spheric image into a planar, 2-dimensional image.
Maps with different projections are not comparable and cannot be overlaid. Therefore, we need to make sure we work always on the same projection when using more maps. In addition, projections can have different coordinate systems and therefore when extracting distance information from maps, some projections (i.e. metric based projections e.g. Mercator projection or the Albers equal-area projection) will give a more accurate representation of the distance than others. To work around this, we can transform our maps between projections, in R we use EPSG codes to do this.
In this section we will read and manipulate vector data in R. * Vector data represents real world features within the GIS environment. A feature is anything you can see on the landscape. * Vector data is commonly stored as a shapefile and can contain either point data or polygon data. * Data contains information attached to each feature, we call these attributes.
Features can be points (red) representing specific x,y locations, such as a trees or camera sites; polygons (white) representing areas, such as forests or residential areas; and lines (yellow/green and blue) representing continuous linear features, such as roads or rivers
Vector data reads as a data frame would, each row is a feature and each column is an attribute, and contains usually a geometry column where the xy coordinates for the shapes are stored. Plotting these data will plot the points or shapes in the map using the xy coordinates stored for each feature.
## Linking to GEOS 3.11.2, GDAL 3.6.2, PROJ 9.2.0; sf_use_s2() is TRUE
## Simple feature collection with 6 features and 3 fields
## Geometry type: POINT
## Dimension: XY
## Bounding box: xmin: -122.3607 ymin: 47.64339 xmax: -122.3607 ymax: 47.64339
## Geodetic CRS: WGS 84
## speciesname locationid date geometry
## 1 Procyon lotor SEWA_N01_DRP2 2019-07-03 05:02:21 POINT (-122.3607 47.64339)
## 2 Procyon lotor SEWA_N01_DRP2 2019-07-03 06:04:45 POINT (-122.3607 47.64339)
## 3 Procyon lotor SEWA_N01_DRP2 2019-07-03 06:12:10 POINT (-122.3607 47.64339)
## 4 Procyon lotor SEWA_N01_DRP2 2019-07-03 06:13:16 POINT (-122.3607 47.64339)
## 5 Procyon lotor SEWA_N01_DRP2 2019-07-05 03:55:53 POINT (-122.3607 47.64339)
## 6 Procyon lotor SEWA_N01_DRP2 2019-07-05 04:05:21 POINT (-122.3607 47.64339)
Packages used to read an manipulate data include the sf package, that reads the shapefile as a spatial data frame, and the terra package that reads the shapefiles as a Spatvector, previously there was also the raster package, but we will try to avoid it as it has been deprecated.
library(sf)
library(terra)
Point data can be obtained directly from a shapefile or a csv file where each row is a feature. In this case we will work with camera trap site data and the information collected at each site, i.e. point.
The camera trap sites here are located in Seattle, and have captured coyote and raccoon presence and absence from the 2019 spring season to the 2021 winter season.
The data is stored as a data frame in a csv.
captures.table <- read.csv("data/captures.csv")
print(head(captures.table))
## speciesname locationid date latitude longitude
## 1 Procyon lotor SEWA_N01_DRP2 2019-07-03 05:02:21 47.64339 -122.3607
## 2 Procyon lotor SEWA_N01_DRP2 2019-07-03 06:04:45 47.64339 -122.3607
## 3 Procyon lotor SEWA_N01_DRP2 2019-07-03 06:12:10 47.64339 -122.3607
## 4 Procyon lotor SEWA_N01_DRP2 2019-07-03 06:13:16 47.64339 -122.3607
## 5 Procyon lotor SEWA_N01_DRP2 2019-07-05 03:55:53 47.64339 -122.3607
## 6 Procyon lotor SEWA_N01_DRP2 2019-07-05 04:05:21 47.64339 -122.3607
The coordinates are stored in the latitude and longitude, to be able to observe these points in the map, and extract environmental information based on their location, we will have to convert it to a spatial data frame object. We will use the st_as_sf() function from the sf package and we will specify the projection (crs). How do we know which projection our data is in?
captures.spatial <- st_as_sf(captures.table,
coords = c("longitude","latitude"),
crs = 4326)
print(head(captures.spatial))
## Simple feature collection with 6 features and 3 fields
## Geometry type: POINT
## Dimension: XY
## Bounding box: xmin: -122.3607 ymin: 47.64339 xmax: -122.3607 ymax: 47.64339
## Geodetic CRS: WGS 84
## speciesname locationid date geometry
## 1 Procyon lotor SEWA_N01_DRP2 2019-07-03 05:02:21 POINT (-122.3607 47.64339)
## 2 Procyon lotor SEWA_N01_DRP2 2019-07-03 06:04:45 POINT (-122.3607 47.64339)
## 3 Procyon lotor SEWA_N01_DRP2 2019-07-03 06:12:10 POINT (-122.3607 47.64339)
## 4 Procyon lotor SEWA_N01_DRP2 2019-07-03 06:13:16 POINT (-122.3607 47.64339)
## 5 Procyon lotor SEWA_N01_DRP2 2019-07-05 03:55:53 POINT (-122.3607 47.64339)
## 6 Procyon lotor SEWA_N01_DRP2 2019-07-05 04:05:21 POINT (-122.3607 47.64339)
We want our data to be in the NAD83 projection, because we need our data in the UTM coordinate system to be compatible with google map #I dont know if I understood correctly the comment below but wrote a draft anyways. # Transform to UTM # Will only plot on google map if it’s in lat/lon, so we need to think about # where we introduce this.
captures.utm <- st_transform(captures.spatial, crs = 26910)
print(head(captures.utm))
## Simple feature collection with 6 features and 3 fields
## Geometry type: POINT
## Dimension: XY
## Bounding box: xmin: 548015.5 ymin: 5276863 xmax: 548015.5 ymax: 5276863
## Projected CRS: NAD83 / UTM zone 10N
## speciesname locationid date geometry
## 1 Procyon lotor SEWA_N01_DRP2 2019-07-03 05:02:21 POINT (548015.5 5276863)
## 2 Procyon lotor SEWA_N01_DRP2 2019-07-03 06:04:45 POINT (548015.5 5276863)
## 3 Procyon lotor SEWA_N01_DRP2 2019-07-03 06:12:10 POINT (548015.5 5276863)
## 4 Procyon lotor SEWA_N01_DRP2 2019-07-03 06:13:16 POINT (548015.5 5276863)
## 5 Procyon lotor SEWA_N01_DRP2 2019-07-05 03:55:53 POINT (548015.5 5276863)
## 6 Procyon lotor SEWA_N01_DRP2 2019-07-05 04:05:21 POINT (548015.5 5276863)
Let’s observe the spatial distribution of the points by plotting them using the ggplot2 package. The geom_sf() function will allow us to plot the spatial data frame object.
library(ggplot2)
## Warning: package 'ggplot2' was built under R version 4.3.2
ggplot(captures.utm) + geom_sf()
There is no basemap in this plot, we want to add a reference so we can easily distinguish between locations. We will use google maps for this, first we load the ggmap package and register our api from google. # Use an API key for the ‘uwin-mapping’ project that I created for this # Describe in the Rmd how to get your own # setup API key to use
library(ggmap)
## The legacy packages maptools, rgdal, and rgeos, underpinning the sp package,
## which was just loaded, will retire in October 2023.
## Please refer to R-spatial evolution reports for details, especially
## https://r-spatial.org/r/2023/05/15/evolution4.html.
## It may be desirable to make the sf package available;
## package maintainers should consider adding sf to Suggests:.
## The sp package is now running under evolution status 2
## (status 2 uses the sf package in place of rgdal)
## ℹ Google's Terms of Service: <https://mapsplatform.google.com>
## ℹ Please cite ggmap if you use it! Use `citation("ggmap")` for details.
my_api <- 'AIzaSyBt73bzxdvlS6ioit4OTCaIE6SrZJ9aWnA'
register_google(key = my_api)
We then get the map relevant to our region using the get_map() function. This can be done both using a bounding box with coordinate information if we want a specific study area, or just the city’s name.
seattle <- get_map("seattle", source= "google", api_key = my_api)
## ℹ <https://maps.googleapis.com/maps/api/staticmap?center=seattle&zoom=10&size=640x640&scale=2&maptype=terrain&language=en-EN&key=xxx>
## ℹ <https://maps.googleapis.com/maps/api/geocode/json?address=seattle&key=xxx>
ggmap(seattle)
If we use a bounding box, the code will look like this:
seattle <- get_map(location = c(left = -122.5, bottom = 47.4,
right = -122.0, top = 47.8),
source ="google", api_key = my_api)
## ! Bounding box given to Google - spatial extent only approximate.
## ℹ <https://maps.googleapis.com/maps/api/staticmap?center=47.6,-122.25&zoom=11&size=640x640&scale=2&maptype=terrain&language=en-EN&key=xxx>
ggmap(seattle)
Now we can plot our camera site locations on the Seattle map #note the original crs works with google, not the utm.
ggmap(seattle) +
geom_sf(data=captures.spatial, inherit.aes = FALSE)
## Coordinate system already present. Adding new coordinate system, which will
## replace the existing one.
Now lets plot on a map the coyotes captured at each the camera trap
sites. We will filter the data based on species name, using the dplyr
package, and count detections at each site. We will then plot using the
function seen above, but setting point size based on the number of
detections at each site.
library(dplyr)
##
## Attaching package: 'dplyr'
## The following objects are masked from 'package:stats':
##
## filter, lag
## The following objects are masked from 'package:base':
##
## intersect, setdiff, setequal, union
coyotes <- filter(captures.spatial, speciesname == "Canis latrans") %>%
group_by(locationid) %>%
summarize(detections = n())
ggmap(seattle) +
geom_sf(data = coyotes, inherit.aes = FALSE, aes(size = detections)) +
ggtitle("Coyote detections") +
labs(size = "Detection frequency") +
scale_size_continuous(breaks=seq(100, 500, by=100))
## Coordinate system already present. Adding new coordinate system, which will
## replace the existing one.
Now do the same for raccoons # Hide this using the method from the UWIN tutorial # https://github.com/urbanwildlifeinstitute/UWIN_tutorials/blob/main/tutorials/Detection%20Mapping/detection_mapping.md?plain=1
raccoons <- filter(captures.spatial, speciesname == "Procyon lotor") %>%
group_by(locationid) %>%
summarize(detections = n())
ggmap(seattle) +
geom_sf(data = raccoons, inherit.aes = FALSE, aes(size = detections)) +
ggtitle("Coyote detections") +
labs(size = "Detection frequency") +
scale_size_continuous(breaks=seq(100, 500, by=100))
## Coordinate system already present. Adding new coordinate system, which will
## replace the existing one.